DETAILED ACTION

1. This Office action is responsive to the amendment filed on September 6, 2007. Claims 2, 
9, and 16 have been cancelled; claims 1, 3-4, 8, 10-12, 15, and 17-19 are amended; and claims 1, 3-8, 
10-15, and 17-19 are pending and have been examined. 



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

2. Claims 1, 3-8, 10-15, and 17-19 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Van Thong et al. (US Patent No. 6,490,553) in view of Reynar (US Patent No. 
6,446,041). 

Claim 1. Van Thong teaches, a method of dynamically and automatically adjusting a 
speech output rate to match a speech input rate, comprising the steps of: 
receiving a speech input; (Fig. 2 Speech input 17) 

computing a speech input rate from the speech input; and (Fig. 2 Recognizer & Speech 
rate calculation Unit 41; analyses the recorded speech data and calculates the average 
speech rate. This unit may operate in real time, or the averaged instantaneous rate values 
may be computed ahead of time during the preprocessing step. Col. 10, lines 50-55) 
dynamically adjusting the speech output rate to match the speech input rate. (Fig. 2 Rate 
Adjusted Speech output 47; plays back recorded speech at a certain rate, this playback 
rate is able to match the input rate so that expressions sound the same coming in and 
exiting the system). Van Thong does not teach whether the speech input is from an audio 
recording or computer-generated text-to-speech and determining whether a type of speech 
output to be provided at the speech output rate is the text-to-speech or the recorded speech 
output. Reynar discloses a multi-source input and playback utility that accepts inputs from 
various sources, transcribes the inputs as text, and plays aloud user-selected portions of the 
text. The user may select a portion of the text and request audio playback thereof. The 
utility examines each transcribed word in the selected text. If stored audio data is associated 
with a given word, that audio data is retrieved and played. If no audio data is associated, then a 
text-to-speech entry or series of entries is retrieved and played instead. The utility may also 
speed up, slow down, or otherwise alter the TTS entry prior to playback in order to match the 
stored audio data. The utility may analyze the audio data waveform, extracting such 
information as speech speed, pitch, tone, and timbre. The utility may then alter these 
characteristics in the TTS entry in order to more closely parallel the sound of the TTS entry 
to a speaker's own speech patterns. It would have been obvious to one of ordinary skill in the 
art at the time of the invention to modify the system of Van Thong to provide for 
text-to-speech audio and recorded audio, for the purpose of providing all necessary audio data 
of any desired words for use in the system. 
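
For illustration only, the per-word selection attributed to Reynar above (play stored audio when it is associated with a word, otherwise fall back to a text-to-speech entry) might be sketched as follows. This is a minimal sketch, not code from either reference; the names `Word`, `recorded_clip`, and `plan_playback` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    """A transcribed word; `recorded_clip` is a hypothetical handle to stored audio."""
    text: str
    recorded_clip: Optional[str] = None


def plan_playback(words: List[Word]) -> List[Tuple[str, str]]:
    """Return, for each word, the output type and the payload to play."""
    plan = []
    for word in words:
        if word.recorded_clip is not None:
            # Stored audio is associated with the word: play the recording.
            plan.append(("recorded", word.recorded_clip))
        else:
            # No stored audio: fall back to a text-to-speech entry for the word.
            plan.append(("tts", word.text))
    return plan
```

A caller would then know, word by word, whether the output type is recorded speech or text-to-speech before any rate adjustment is applied.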

Claim 3. The combination of Van Thong and Reynar teaches, the method of claim 1, 
wherein the method further comprises the step of adjusting a rate of text-to-speech synthesis to 
match the speech input rate if the type of speech output is text-to-speech. (Fig. 1; The next 
module, the speech control module 19, controls the rate of speech depending on how fast 
the text is spoken and/or how fast the operator 53 types. Col. 3, lines 55-54; Alternatively, 
the speech playback rate may depend on the external synchronization source such as the 
text-input of an operator transcribing the recorded speech. Col. 12, lines 21-23) 

Claim 4. The combination of Van Thong and Reynar teaches, the method of claim 1, 
wherein the method further comprises the step of counting alternate text available from a 
recorded output and determining an audio file length to compute a default output rate 
(Alternatively, the speech playback rate may depend on the external synchronization 
source such as the text-input of an operator transcribing the recorded speech. Col. 12, lines 
21-23) which is used to adjust a recorded output rate (Fig. 2 rate adjusted speech input 47) to 
match the input speech rate when the type of speech is recorded (Fig. 2 input speech 17) and 
alternate text is available. (The desired target speech rate 37 may be a "predefined value" or 
depend on external synchronization, here the keyboard input, i.e. text available (i.e. real 
time transcribed text) 49. Col. 5, lines 1-3) 

Claim 5. The combination of Van Thong and Reynar teaches, the method of claim 4, 
wherein the method further comprises the step of obtaining an output word count from a 
transcription of a recorded speech output and determining an audio file length to compute a 
default output rate (Alternatively, the speech playback rate may depend on the external 
synchronization source such as the text-input of an operator transcribing the recorded 
speech. Col. 12, lines 21-23) which is used to adjust a recorded output rate (Fig. 2 rate adjusted 
speech input 47) to match the input speech rate when the type of speech is recorded (Fig. 2 
input speech 17) and alternate text is unavailable (The desired target speech rate 37 may be a 
"predefined value", i.e. text not available, or depend on external synchronization, here the 
keyboard input (i.e. real time transcribed text) 49. Col. 5, lines 1-3) 
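
As a reading aid, the computation recited in claims 4 and 5 (a word count and an audio file length yielding a default output rate, which is then used to adjust the recorded output toward the input speech rate) can be sketched as below. This is an assumption-laden illustration: the function names, the words-per-minute unit, and the use of a simple ratio as the adjustment are not taken from the claims or the cited references.

```python
def default_output_rate(word_count: int, audio_seconds: float) -> float:
    """Default rate of a recording in words per minute (unit is an assumption)."""
    if audio_seconds <= 0:
        raise ValueError("audio length must be positive")
    return word_count / (audio_seconds / 60.0)


def playback_speed_factor(input_rate_wpm: float, default_rate_wpm: float) -> float:
    """Factor by which the recording would be sped up (>1) or slowed down (<1)."""
    if default_rate_wpm <= 0:
        raise ValueError("default rate must be positive")
    return input_rate_wpm / default_rate_wpm
```

Under these assumptions, the word count would come from the alternate text when it is available (claim 4) or from a transcription of the recording when it is not (claim 5); the rest of the calculation is the same in both cases.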

Claim 6. The combination of Van Thong and Reynar teaches, the method of claim 1, 
wherein the step of computing the speech input rate comprises the step of computing a running 
average of the rates computed for the last n utterances of the speech input. (Fig. 2 Recognizer & 
Speech rate calculation Unit 41, analyses the recorded speech data and calculates the 
average speech rate. This unit may operate in real time, or the averaged instantaneous rate 
values may be computed ahead of time during the preprocessing step. Col. 10, lines 50-55) 
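
The running average over the last n utterances recited in claim 6 could, for example, be kept with a fixed-length window, as in the sketch below. The class name `RunningRate`, the per-utterance inputs (word count and duration), and the words-per-minute unit are hypothetical; the cited passage of Van Thong refers only to an average speech rate.

```python
from collections import deque


class RunningRate:
    """Running average of the speech rates of the last n utterances."""

    def __init__(self, n: int) -> None:
        if n <= 0:
            raise ValueError("n must be positive")
        # deque(maxlen=n) silently discards the oldest rate once n are stored.
        self._rates = deque(maxlen=n)

    def add_utterance(self, word_count: int, seconds: float) -> float:
        """Record one utterance and return the updated running average (wpm)."""
        if seconds <= 0:
            raise ValueError("utterance duration must be positive")
        self._rates.append(word_count / seconds * 60.0)
        return sum(self._rates) / len(self._rates)
```

With n = 2, for instance, a third utterance pushes the first one out of the window, so the estimate tracks recent changes in the speaker's pace rather than the whole session.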

Claim 7. The combination of Van Thong and Reynar teaches, the method of claim 1, 
wherein the method further comprises the step of feeding back an estimate of the speech input 
rate (Fig. 2 Speech rate calculation Unit element 41) to a speech production mechanism to 
adjust the speech output rate. (Fig. 2 rate adjusted speech output) 

Claim 8. Van Thong teaches, a system for dynamically and automatically adjusting a 
speech output rate to match a speech input rate, comprising: a memory; (Fig. 6 Laptop and 
memory storage devices) and a processor programmed to: receive a speech input; (Fig. 2 
Speech input 17) 

compute a speech input rate from the speech input; and (Fig. 2 Recognizer & Speech 
rate calculation Unit 41; analyses the recorded speech data and calculates the average 
speech rate. This unit may operate in real time, or the averaged instantaneous rate values 
may be computed ahead of time during the preprocessing step. Col. 10, lines 50-55) 

dynamically adjust the speech output rate to match the speech input rate. (Fig. 2 Rate 
Adjusted Speech output 47; plays back recorded speech at a certain rate, this playback rate 
is able to match the input rate so that expressions sound the same coming in and exiting the 
system). Van Thong does not teach whether the speech input is from an audio recording or 
computer-generated text-to-speech and determining whether a type of speech output to be 
provided at the speech output rate is the text-to-speech or the recorded speech output. Reynar 
discloses a multi-source input and playback utility that accepts inputs from various sources, 
transcribes the inputs as text, and plays aloud user-selected portions of the text. The 
user may select a portion of the text and request audio playback thereof. The utility examines 
each transcribed word in the selected text. If stored audio data is associated with a given word, 
that audio data is retrieved and played. If no audio data is associated, then a text-to-speech entry 
or series of entries is retrieved and played instead. The utility may also speed up, slow down, 
or otherwise alter the TTS entry prior to playback in order to match 
the stored audio data. The utility may analyze the audio data waveform, extracting such 
information as speech speed, pitch, tone, and timbre. The utility may then alter these 
characteristics in the TTS entry in order to more closely parallel the sound of the TTS entry to a 
speaker's own speech patterns. It would have been obvious to one of ordinary skill in the art at 
the time of the invention to modify the system of Van Thong to provide for text-to-speech audio and 
recorded audio, for the purpose of providing all necessary audio data of any desired words for 
use in the system. 

Claim 10. The combination of Van Thong and Reynar teaches, the system of claim 8, 
wherein the processor is further programmed to adjust a rate of text-to-speech synthesis to match 
the speech input rate if the type of speech output is text-to-speech. (Fig. 1; The next module, the 
speech control module 19, controls the rate of speech depending on how fast the text is 
spoken and/or how fast the operator 53 types. Col. 3, lines 55-54; Alternatively, the speech 
playback rate may depend on the external synchronization source such as the text-input of 
an operator transcribing the recorded speech. Col. 12, lines 21-23) 

Claim 11. The combination of Van Thong and Reynar teaches, the system of claim 8, 
wherein the processor is further programmed to count alternate text available from a recorded 
output and determine an audio file length to compute a default output rate (Alternatively, the 
speech playback rate may depend on the external synchronization source such as the 
text-input of an operator transcribing the recorded speech. Col. 12, lines 21-23) which is used to 
adjust a recorded output rate (Fig. 2 rate adjusted speech input 47) to match the input speech 
rate when the type of speech is recorded (Fig. 2 input speech 17) and alternate text is available. 
(The desired target speech rate 37 may be a "predefined value" or depend on external 
synchronization, here the keyboard input, i.e. text available (i.e. real time transcribed text) 
49. Col. 5, lines 1-3) 

Claim 12. The combination of Van Thong and Reynar teaches, the system of claim 8, 
wherein the processor is further programmed to obtain an output word count from a transcription 
of a recorded speech output and determine an audio file length to compute a default output rate 
(Alternatively, the speech playback rate may depend on the external synchronization 
source such as the text-input of an operator transcribing the recorded speech. Col. 12, lines 
21-23) which is used to adjust a recorded output rate (Fig. 2 rate adjusted speech input 47) to 
match the input speech rate when the type of speech is recorded (Fig. 2 input speech 17) and 
alternate text is unavailable (The desired target speech rate 37 may be a "predefined value", 
i.e. text not available, or depend on external synchronization, here the keyboard input (i.e. 
real time transcribed text) 49. Col. 5, lines 1-3) 

Claim 13. The combination of Van Thong and Reynar teaches, the system of claim 8, 
wherein the processor is further programmed to compute a running average of the rates 
computed for the last n utterances of the speech input when computing the speech input rate. 
(Fig. 2 Recognizer & Speech rate calculation Unit 41, analyses the recorded speech data and 
calculates the average speech rate. This unit may operate in real time, or the averaged 
instantaneous rate values may be computed ahead of time during the preprocessing step. 
Col. 10, lines 50-55) 

Claim 14. The combination of Van Thong and Reynar teaches, the system of claim 8, 
wherein the processor is further programmed to feed back an estimate of the speech input rate 
(Fig. 2 Speech rate calculation Unit element 41) to a speech production mechanism to adjust 
the speech output rate. (Fig. 2 rate adjusted speech output) 

Claim 15. Van Thong teaches, a machine-readable storage, having stored thereon a 
computer program having a plurality of code sections executable by a machine for causing the 
machine to perform (Fig. 6 Laptop and memory storage devices) the steps of receiving a 
speech input; (Fig. 2 Speech input 17) 

computing a speech input rate from the speech input; and (Fig. 2 Recognizer & Speech 
rate calculation Unit 41; analyses the recorded speech data and calculates the average 
speech rate. This unit may operate in real time, or the averaged instantaneous rate values 
may be computed ahead of time during the preprocessing step. Col. 10, lines 50-55) 
dynamically adjusting the speech output rate to match the speech input rate. (Fig. 2 Rate 
Adjusted Speech output 47; plays back recorded speech at a certain rate, this playback rate 
is able to match the input rate so that expressions sound the same coming in and exiting the 
system). Van Thong does not teach whether the speech input is from an audio recording or 
computer-generated text-to-speech and determining whether a type of speech output to be 
provided at the speech output rate is the text-to-speech or the recorded speech output. Reynar 
discloses a multi-source input and playback utility that accepts inputs from various sources, 
transcribes the inputs as text, and plays aloud user-selected portions of the text. The 
user may select a portion of the text and request audio playback thereof. The utility examines 
each transcribed word in the selected text. If stored audio data is associated with a given word, 
that audio data is retrieved and played. If no audio data is associated, then a text-to-speech entry 
or series of entries is retrieved and played instead. The utility may also speed up, slow down, 
or otherwise alter the TTS entry prior to playback in order to match 
the stored audio data. The utility may analyze the audio data waveform, extracting such 
information as speech speed, pitch, tone, and timbre. The utility may then alter these 
characteristics in the TTS entry in order to more closely parallel the sound of the TTS entry to a 
speaker's own speech patterns. It would have been obvious to one of ordinary skill in the art at 
the time of the invention to modify the system of Van Thong to provide for text-to-speech audio and 
recorded audio, for the purpose of providing all necessary audio data of any desired words for 
use in the system. 

Claim 17. The combination of Van Thong and Reynar teaches, the machine-readable 
storage of claim 15, wherein the machine-readable storage is further programmed to adjust a rate 
of text-to-speech synthesis to match the speech input rate if the type of speech output is 
text-to-speech. (Fig. 1; The next module, the speech control module 19, controls the rate of speech 
depending on how fast the text is spoken and/or how fast the operator 53 types. Col. 3, lines 
55-54; Alternatively, the speech playback rate may depend on the external synchronization 
source such as the text-input of an operator transcribing the recorded speech. Col. 12, lines 
21-23) 

Claim 18. The combination of Van Thong and Reynar teaches, the machine-readable 
storage of claim 15, wherein the machine-readable storage is further programmed to count 
alternate text available from a recorded output and determine an audio file length to compute a 
default output rate (Alternatively, the speech playback rate may depend on the external 
synchronization source such as the text-input of an operator transcribing the recorded 
speech. Col. 12, lines 21-23) which is used to adjust a recorded output rate (Fig. 2 rate adjusted 
speech input 47) to match the input speech rate when the type of speech is recorded (Fig. 2 
input speech 17) and alternate text is available. (The desired target speech rate 37 may be a 
"predefined value" or depend on external synchronization, here the keyboard input, i.e. text 
available (i.e. real time transcribed text) 49. Col. 5, lines 1-3) 

Claim 19. The combination of Van Thong and Reynar teaches, the machine-readable 
storage of claim 15, wherein the machine-readable storage is further programmed to obtain an 
output word count from a transcription of a recorded speech output and determine an audio file 
length to compute a default output rate (Alternatively, the speech playback rate may depend 
on the external synchronization source such as the text-input of an operator transcribing 
the recorded speech. Col. 12, lines 21-23) which is used to adjust a recorded output rate (Fig. 2 
rate adjusted speech input 47) to match the input speech rate when the type of speech is 
recorded (Fig. 2 input speech 17) and alternate text is unavailable (The desired target speech 
rate 37 may be a "predefined value", i.e. text not available, or depend on external 
synchronization, here the keyboard input (i.e. real time transcribed text) 49. Col. 5, lines 1-3) 

Response to Arguments 

3. Applicant's arguments with respect to claims 1, 3-8, 10-15, and 17-19 have been 
considered but are moot in view of the new ground(s) of rejection. 

Conclusion 

Applicant's amendment necessitated the new ground(s) of rejection presented in this 
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Angela A. Armstrong whose telephone number is 571-272-7598. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick N. Edouard can be reached on 571-272-7603. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 